
Amdahl code, redux #411

Merged (28 commits) Feb 16, 2023
Conversation

@tkphd (Contributor) commented Jun 23, 2022

This PR re-opens #407, which was merged prematurely, and relates to #409. Original summary and quoted comments follow.


This PR replaces the dummy hpc-intro-data.tar.gz with a tarballed version of @ocaisa's amdahl program, hpc-intro-code.tar.gz, and substitutes that black-box program in place of the pi calculator. This is in keeping with our goal of reducing the "programming" aspect of hpc-intro to the bare minimum, focusing instead on the cluster and practicing job submission.

Note that, whereas the upstream amdahl package can be installed as a module, the version archived here is a stand-alone Python file named amdahl, with defaults of 30 seconds of work and a parallel proportion of 85%.
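
For reference, Amdahl's law gives the best-case speedup S on N workers for a program whose parallel proportion is p:

    S(N) = 1 / ((1 - p) + p / N)

With the default p = 0.85, the speedup can therefore never exceed 1 / (1 - 0.85) ≈ 6.7, no matter how many cores a job requests.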

@reid-a and I are planning to teach this version on Friday, 24 June (tomorrow).


@ocaisa wrote,

I'm a bit surprised by the tarball; in #378 you argued against going that route (for good reasons), so I wonder why do that now?

@tkphd responded,

Excellent point @ocaisa. The primary answer is "expediency," as Andrew & I want to teach this tomorrow and do not have Modules on the cluster in question. (Its nascent replacement does!)

In #378 you checked in a tarball with a version number, suggesting a desire to iterate on the code. My view is that a tarball is for releases, not development code, which is why I suggested spinning it out into a separate repository.

We also have a tarball for Ep. 15, which the existing lesson uses to work through scp and tar, but the contents are never actually used. Substituting a new tarball allowed me to keep the bones of Ep. 15, but convert it to do something of actual use, i.e., downloading locally, uploading to the cluster, extracting, and running amdahl.

I also made changes to the amdahl[.py] script before tarballing. Some of this is useful and could go upstream, some not. Specifically,

  1. The upstream amdahl.py can't be run stand-alone; I added an if __name__ == "__main__" clause to make it work.
  2. I prepended a #!/usr/bin/env python3 and made it executable.
  3. I dropped the .py extension to enhance the mystique and make it more of a black box.
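
For illustration, the resulting stand-alone file might look like the sketch below; the function name and its body are stand-ins for the actual upstream code.

#!/usr/bin/env python3
# "amdahl" -- shebang prepended, .py extension dropped, file made executable.

def amdahl():
    # placeholder for the upstream amdahl.py logic (30 s of 85%-parallel work)
    print("simulating Amdahl's law...")

if __name__ == "__main__":  # the clause added so the file runs stand-alone
    amdahl()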

@ocaisa commented,

The upstream package can be installed with pip3 install --user amdahl and already comes with the amdahl executable as part of that installation, but we can talk about it after your lesson.

@tkphd asked,

@ocaisa is that something learners would each do? If so, that would mean explaining pip, which I definitely do not want to do. Otherwise, amdahl works because pip sets an alias, or...?

Edited to update Amdahl repo URL

Commit: Revert "…-amdahl-code" (reverts commit bdf8529, reversing changes made to 86229eb)
@ocaisa (Contributor) commented Jun 27, 2022

For me there are a few general assumptions when delivering this course, one of which is that you have knowledge of, and control over, your users' environment. The amdahl package is released the way it is because it can work in a number of ways: if you have modules, you can expose it that way (including encoding its mpi4py dependency); if you don't, you can still install it via pip (which also checks for and installs the required dependency). Just shipping a tarball has a good chance of not working, because it assumes mpi4py is already available in the user's environment.

Ultimately this sounds like something that belongs in a snippet, as there is a selection of ways to make the code available, all of which are perfectly valid (sketched after this list):

  • Make it available as a module
  • Install it via pip
  • Download the source itself and ensure the dependency is satisfied somehow before running it (this may need some minor tweaking to work out of the box, as per @tkphd's comment above... but I would not drop the .py extension; instead, change the command used to execute the code)
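
For concreteness, the three routes might look something like this on the command line (the module name and the extracted source path are assumptions):

# 1. as a module, if the site provides one
module load amdahl
# 2. via pip, which also pulls in mpi4py
pip3 install --user amdahl
# 3. from source, keeping the .py extension and invoking the interpreter
wget -O amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/main
tar -xzf amdahl.tar.gz
mpirun -np 4 python3 hpc-carpentry-amdahl-*/amdahl/amdahl.py  # path illustrative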

@tkphd (Contributor, Author) commented Jul 6, 2022

Thanks for the thoughtful feedback @ocaisa. My rationalization for the tarball is that Ep. 15 (transferring files) depends on one, and previously it contained repackaged data from an unrelated Python lesson that was never used after that episode. This meant a lot of the transfer commands were boilerplate with no concrete filenames or paths, which is mystifying and seems counterproductive. Overhauling Ep. 15 with an amdahl tarball feels right, because it makes the transfer concepts concrete and gives them purpose.

You are right, of course, that mpi4py may be unavailable, and having the cluster admin install amdahl as a module or via pip is sensible. Can you update the README/docs in the amdahl repo to outline the recommended workflow for installation, and usage after it's installed?

I can revise the tarball in this PR to contain a driver script that simply imports amdahl and runs it, so we can still work through the curl/wget, tar, and scp/rsync commands in Ep. 15, then run the code in Ep. 16.
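
Such a driver could be as small as the following sketch (the entry-point name amdahl.amdahl is an assumption about the upstream package):

#!/usr/bin/env python3
# run_amdahl.py -- hypothetical driver shipped in the lesson tarball
import amdahl  # assumes the amdahl package was installed via pip or a module

amdahl.amdahl()  # assumed entry point; launch under mpirun for multiple ranks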

@tkphd (Contributor, Author) commented Sep 1, 2022

In today's Coworking Hours, we had a vigorous debate about how to install amdahl. The best/obvious option is to python3 -m pip install amdahl, which will pull in mpi4py if it's not already installed. Possible breakage modes include

  1. The login node is firewalled, and pip can't reach PyPI to get the code.
  2. The PyPI version of mpi4py is incompatible with the installed MPI libraries.

Can folks with access try to install amdahl on their login nodes and check whether it runs successfully?

python3 -m pip install amdahl
mpirun -np 4 amdahl

Thanks!

@ocaisa (Contributor) commented Sep 13, 2022

I read the notes of the Coworking Hour where this was discussed, and I still think the easiest option is the pip install approach.

As mentioned in those notes, it is possible to do an offline pip install if you have the tarball available. You can get the latest version of the tarball from https://github.com/hpc-carpentry/amdahl/tarball/main, so there is no need to maintain a versioned tarball in the lesson.
With the tarball on the system (via curl/wget, plus scp if needed), let's assume that the $HOME directory is on a shared filesystem (usually a pretty reasonable assumption). Installation would be:

python -m pip install --user <tarball>

This doesn't resolve any dependency problems you may have; the most problematic one for our use case is mpi4py. It can be installed with

env MPICC=/path/to/mpicc python -m pip install --user mpi4py

While I haven't tested it, I imagine that mpi4py could also be replaced by a tarball in this command. The MPICC environment variable is only really needed if mpicc is not in your environment (or if you need a particular MPI C compiler).
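
Putting those pieces together, an offline-friendly sequence might look like this (file names are illustrative):

# fetch wherever there is internet access; scp to the cluster if needed
curl -L -o amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/main
# build mpi4py against the cluster's MPI, then install amdahl from the tarball
env MPICC=$(which mpicc) python3 -m pip install --user mpi4py
python3 -m pip install --user ./amdahl.tar.gz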

@tkphd tkphd marked this pull request as draft September 15, 2022 22:00
@tkphd (Contributor, Author) commented Sep 15, 2022

Following this month's meetings, I have a few things to change on this branch before it's ready to merge: mostly, teaching pip as the installation method of choice, with caveats for mpi4py and offline installation instructions.

@tkphd (Contributor, Author) commented Jan 19, 2023

When using the Amdahl/mpi4py tarballs to exercise "transferring files", make sure to cover cluster firewall configurations, which may allow users to "pull" from inside, "push" from outside, or both.
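
For example (hostname and paths are hypothetical), the two directions look like:

# "pull" from inside: run on a login node; needs outbound access to GitHub
wget -O amdahl.tar.gz https://github.com/hpc-carpentry/amdahl/tarball/main
# "push" from outside: run on your laptop; needs inbound ssh to the cluster
scp amdahl.tar.gz yourUsername@cluster.example.com: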

@tkphd tkphd marked this pull request as ready for review January 19, 2023 20:59
@tkphd tkphd force-pushed the amdahl-code-redux branch 2 times, most recently from 144736e to 16ed34a, January 19, 2023 21:56
@ocaisa (Contributor) left a comment:

I think hpc-intro-code.tar.gz is still being shipped with this PR

Review threads (all resolved):
  • _episodes/15-transferring-files.md (5, outdated)
  • _episodes/16-parallel.md (4, of which 2 outdated)
  • _episodes/17-resources.md (1, outdated)
@tkphd (Contributor, Author) commented Jan 31, 2023

I think hpc-intro-code.tar.gz is still being shipped with this PR

Weird, I thought 8a6f91b deleted it. Taken care of in decb121, which also consolidates the lingering pi code in a new tarball.

@tkphd (Contributor, Author) commented Feb 6, 2023

Maybe tools like https://www.parallelpython.com/, https://www.dask.org/, and https://github.com/pgiri/dispy should be mentioned? Not everyone will need MPI, and these might provide good alternatives when task farming is the main purpose.

Out of scope for this PR. Please file a separate issue or, even better, a PR setting MPI in context with these other tools/approaches, bearing in mind that there may be controversy over whether task farming is HPC (vs. HTC).

@tkphd tkphd requested review from wirawan0, bkmgit and ocaisa and removed request for wirawan0 and bkmgit February 16, 2023 01:47
@tkphd (Contributor, Author) commented Feb 16, 2023

I believe the requested changes have been made. Please re-review.

@tkphd tkphd requested review from wirawan0 and bkmgit and removed request for bkmgit and wirawan0 February 16, 2023 12:05
@ocaisa (Contributor) left a comment:

LGTM

Inline comment on _episodes/16-parallel.md (outdated, resolved), quoting the lesson text:
    Finally, we'll ensure the `my_pi` through `print` lines only run on rank 0.
    Otherwise, every parallel processor will print its local value,
    and the report will become hopelessly garbled:

    Then submit your job. Note that the submission command has not really changed
@ocaisa commented Feb 16, 2023:

I'll create an issue for this later, but this is not necessarily true.

@ocaisa ocaisa merged commit 3117db8 into carpentries-incubator:gh-pages Feb 16, 2023
@ocaisa (Contributor) commented Feb 16, 2023

Many, many thanks @tkphd for putting so much effort into this PR!

@tkphd tkphd deleted the amdahl-code-redux branch February 16, 2023 21:54